Preface & Acknowledgements


Welcome  to  our  Tenth  Annual  Acquisition  Research  Symposium!  We  regret  that  this 
year  it  will  be  a  “paper  only”  event.  The  double  whammy  of  sequestration  and  a  continuing 
resolution,  with  the  attendant  restrictions  on  travel  and  conferences,  created  too  much 
uncertainty  to  properly  stage  the  event.  We  will  miss  the  dialogue  with  our  acquisition 
colleagues  and  the  opportunity  for  all  our  researchers  to  present  their  work.  However,  we 
intend  to  simulate  the  symposium  as  best  we  can,  and  these  Proceedings  present  an 
opportunity  for  the  papers  to  be  published  just  as  if  they  had  been  delivered.  In  any  case,  we 
will  have  a  rich  store  of  papers  to  draw  from  for  next  year’s  event  scheduled  for  May  14-15, 
2014! 


Despite  these  temporary  setbacks,  our  Acquisition  Research  Program  (ARP)  here  at 
the  Naval  Postgraduate  School  (NPS)  continues  at  a  normal  pace.  Since  the  ARP’s 
founding in 2003, over 1,200 original research reports have been added to the acquisition
body  of  knowledge.  We  continue  to  add  to  that  library,  located  online  at 
www.acquisitionresearch.net,  at  a  rate  of  roughly  140  reports  per  year.  This  activity  has 
engaged  researchers  at  over  70  universities  and  other  institutions,  greatly  enhancing  the 
diversity  of  thought  brought  to  bear  on  the  business  activities  of  the  DoD. 

We  generate  this  level  of  activity  in  three  ways.  First,  we  solicit  research  topics  from 
academia  and  other  institutions  through  an  annual  Broad  Agency  Announcement, 
sponsored  by  the  USD(AT&L).  Second,  we  issue  an  annual  internal  call  for  proposals  to 
seek  NPS  faculty  research  supporting  the  interests  of  our  program  sponsors.  Finally,  we 
serve  as  a  “broker”  to  market  specific  research  topics  identified  by  our  sponsors  to  NPS 
graduate  students.  This  three-pronged  approach  provides  for  a  rich  and  broad  diversity  of 
scholarly  rigor  mixed  with  a  good  blend  of  practitioner  experience  in  the  field  of  acquisition. 
We  are  grateful  to  those  of  you  who  have  contributed  to  our  research  program  in  the  past 
and  encourage  your  future  participation. 

Unfortunately,  what  will  be  missing  this  year  is  the  active  participation  and 
networking  that  has  been  the  hallmark  of  previous  symposia.  By  purposely  limiting 
attendance  to  350  people,  we  encourage  just  that.  This  forum  remains  unique  in  its  effort  to 
bring  scholars  and  practitioners  together  around  acquisition  research  that  is  both  relevant  in 
application  and  rigorous  in  method.  It  provides  the  opportunity  to  interact  with  many  top  DoD 
acquisition  officials  and  acquisition  researchers.  We  encourage  dialogue  both  in  the  formal 
panel  sessions  and  in  the  many  opportunities  we  make  available  at  meals,  breaks,  and  the 
day-ending  socials.  Many  of  our  researchers  use  these  occasions  to  establish  new  teaming 
arrangements  for  future  research  work.  Despite  the  fact  that  we  will  not  be  gathered 
together  to  reap  the  above-listed  benefits,  the  ARP  will  endeavor  to  stimulate  this  dialogue 
through  various  means  throughout  the  year  as  we  interact  with  our  researchers  and  DoD 
officials. 

Affordability  remains  a  major  focus  in  the  DoD  acquisition  world  and  will  no  doubt  get 
even  more  attention  as  the  sequestration  outcomes  unfold.  It  is  a  central  tenet  of  the  DoD’s 
Better  Buying  Power  initiatives,  which  continue  to  evolve  as  the  DoD  finds  which  of  them 
work  and  which  do  not.  This  suggests  that  research  with  a  focus  on  affordability  will  be  of 
great  interest  to  the  DoD  leadership  in  the  year  to  come.  Whether  you’re  a  practitioner  or 
scholar,  we  invite  you  to  participate  in  that  research. 

We  gratefully  acknowledge  the  ongoing  support  and  leadership  of  our  sponsors, 
whose  foresight  and  vision  have  assured  the  continuing  success  of  the  ARP: 
Abstract

DoD  acquisition  is  an  extremely  complex  system,  comprised  of  myriad  stakeholders, 
processes,  people,  activities,  and  organizational  structures.  Processes  within  this  complex 
system  are  encumbered  by  the  continuous  development  of  large  amounts  of  unstructured 
and  unformatted  acquisition  program  data,  difficult  to  aggregate  across  the  “enterprise.”  Yet 
acquisition  analysts  and  decision-makers  must  analyze  all  types  and  spectrums  of  the 
available  data  to  obtain  a  complete  and  comprehensible  picture.  This  can  be  a  daunting  task. 
We have applied a data-driven automation system and methodology, namely, lexical link
analysis (LLA), to help acquisition researchers and decision-makers recognize
important connections (concepts) that form patterns derived from dynamic, ongoing data
collection, analysis, and decision making. LLA technology and methodology is used to
uncover  and  display  relationships  among  competing  programs  and  Navy-driven 
requirements.  In  the  past  year,  we  tested  our  method  using  samples  of  acquisition  data  for 
visualization  and  validity.  LLA  successfully  discovered  statistically  significant  correlations,  and 
automatically  extracted  lexical  links,  thus  improving  acquisition  professionals’  knowledge  of 
their  data.  This  might  have  otherwise  required  expensive  manpower  to  perform.  We  also 
developed  LLA  into  a  web  service  via  several  use  cases  for  large-scale  LLA  applications.  In 
this  paper,  we  show  how  to  apply  the  LLA  web  service  to  the  Acquisition  Visibility  Portal, 
which  is  a  critical  tool  to  provide  the  DoD-wide  acquisition  community  with  authoritative  and 
accurate data services. The resulting methodology could reduce the workload of
decision-makers and achieve improved purchasing decisions, serving to improve the long-term
success  of  acquisition  strategies. 
Introduction

Acquisition  research  has  increased  in  component,  organizational,  technical,  and 
management  complexity.  It  is  difficult  for  acquisition  professionals  to  remain  continuously 
aware  of  their  decision-making  domains  because  information  is  overwhelming  and  dynamic. 
According  to  the  Chairman  of  the  Joint  Chiefs  of  Staff  Instruction  for  Joint  Capabilities 
Integration  and  Development  System  (JCIDS;  CJCS,  2009),  there  are  three  key  processes 
in  the  DoD  that  must  work  in  concert  to  deliver  the  capabilities  required  by  the  warfighters: 
the  requirements  process;  the  acquisition  process;  and  the  Planning,  Programming,  Budget, 
and  Execution  (PPBE)  process. 

Each  process  produces  a  large  amount  of  data  in  an  unstructured  manner;  for 
example,  the  warfighters’  requirements  are  documented  in  Universal  Joint  Task  Lists 
(UJTLs),  Joint  Capability  Areas  (JCAs),  and  Urgent  Need  Statements  (UNSs).  These 
requirements  are  processed  in  the  JCIDS  to  become  projects  and  programs,  which  should 
result  in  products  such  as  weapon  systems  that  meet  the  warfighters’  needs.  Program  data 
are  stored  in  the  Defense  Acquisition  System  (DAS).  Programs  are  divided  into  Major  DoD 
Acquisition  Programs  (MDAPs),  Acquisition  Category  II  (ACATII),  and  so  forth.  Program 
Elements  (PEs)  are  the  documents  used  to  fund  programs  yearly  through  the  congressional 
budget justification process. These data are too voluminous, too unformatted, and too unstructured
to  be  easily  digested  and  understood — even  by  a  team  of  experienced  acquisition 
professionals. 

On  a  conceptual  level,  our  first  question  is  as  follows:  How  can  the  information  that 
emerges from the acquisition process be used to produce overall awareness of the fit
between programs, projects, and systems and the needs for which they were intended?


In  precise  terms,  we  observed  that  there  were  three  important  processes  that 
seemed  fundamentally  disconnected.  Specifically,  they  were  the  congressional  budgeting 
justification  process  (such  as  information  contained  within  the  PEs),  the  acquisition  process 
(such  as  information  in  the  MDAP  and  ACATII),  and  the  warfighters’  requirements  (such  as 
information in UNSs and in UJTLs), as shown in Figure 1. Yet, these were not analyzed and
compared  together  in  a  dynamic,  holistic  methodology  that  could  keep  pace  with  changes 
and  reflect  patterns  of  relationships.  In  the  past  three  years,  we  employed  the  lexical  link 
analysis  (LLA)  automation  methodology  to  analyze  the  data  in  three  areas,  illustrated  in 
Figure 1.

Figure 1. Determining Business Processes Links From Requirements to
DoD  Budget  Justification  to  Final  Products 

In the past, we have explored how analytic and visualization tools such as LLA can
help  detect  data  inconsistency  and  gaps  (bad  data;  see  Research  Status  section).  We  have 
further  systematically  improved  our  understanding  of  the  quality  of  the  data  by  automatically 
discovering  new  patterns  that  were  previously  unknown,  and  identified  data  dependencies 
that  might  be  indicators  for  program  or  investment  performances.  However,  much  more  work 
is  needed  in  this  area  as  well  as  continued  in-depth  analysis  performed  at  the  different 
levels of the Acquisition Visibility Portal (AVP). AVP is a critical tool that provides the
DoD-wide acquisition community with authoritative and accurate data services via interfaces to
DTIC  and  DAMIR  for  programs  (e.g.,  MDAPs,  ACATIIs)  with  milestones,  costs,  schedules 
and  performance  data,  Selected  Acquisition  Reports  (SARs),  and  Acquisition  Strategy 
Reports  (ASRs),  among  others. 

We  seek  to  show  how  LLA  can  be  adapted  to  the  AVP’s  ongoing  requirements  and 
continuous  improvement  of  DoD  data  quality  and  decision-making. 

Methodology 

Overview  of  Lexical  Link  Analysis 

As  in  military  operations,  where  the  term  situational  awareness  was  coined,  we  note 
that  our  efforts  can  inform  awareness  of  analyzed  data  in  a  unique  way  that  helps  improve  a 
decision-maker’s  understanding  or  awareness  of  its  content.  We  therefore  define  awareness 
as  the  cognitive  interface  between  decision-makers  and  a  complex  system,  expressed  in  a 
range  of  terms  or  “features,”  or  specific  vocabulary  or  “lexicon,”  to  describe  the  attributes 
and surrounding environment of the system. Specifically, LLA is a form of text mining in
which word meanings, expressed as lexical terms (e.g., word pairs), are treated as if
they were members of a community in a word network.

Link  analysis  “discovers”  and  displays  a  network  of  word  pairs.  These  word  pair 
networks  are  characterized  by  one-,  two-,  or  three-word  themes.  The  weight  of  each  theme 
is  determined  by  its  frequency  of  occurrence.  Figure  2  shows  a  visualization  of  common 
lexical  links  shared  between  Systems  1  and  2,  shown  in  the  red  box.  Unlinked,  outer  vectors 
(outside the red box) indicate unique system features. Similarly, Figure 3 shows
information from three categories that can be compared, and Figure 4 shows information
from two time periods that can be compared.

Each node, or word hub, represents a system feature, and each color refers to the
collection of lexical links (features) that describes a concept or theme. Nodes in the
overlapping area are lexical links shared by the systems being compared. What is unique
here is that LLA constructs these linkages via intelligent agent technology using social
network grouping methods.

The  closeness  of  the  systems  in  comparison  can  be  visually  examined  or  examined 
using  the  Quadratic  Assignment  Procedure  (QAP;  Hubert  &  Schultz,  1976;  e.g.,  in  UCINET; 
Borgatti,  Everett,  &  Freeman,  2002)  to  compute  the  correlation  and  analyze  the  structural 
differences  in  the  two  systems,  as  shown  in  Figure  5. 
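
The idea behind a QAP comparison can be sketched in a few lines of Python. This is a minimal illustration only, not the UCINET implementation: the function names, the toy adjacency matrices, and the choice of 2,000 permutations are assumptions made for the example. The sketch correlates the off-diagonal cells of two networks and then repeatedly relabels the nodes of one network to estimate how often a correlation at least that large arises by chance.

```python
import math
import random

def off_diagonal(matrix, order):
    """Flatten the off-diagonal cells after relabeling nodes by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / math.sqrt(var_x * var_y)

def qap_correlation(a, b, permutations=2000, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    identity = list(range(len(a)))
    x = off_diagonal(a, identity)
    observed = pearson(x, off_diagonal(b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        # Relabel the nodes of b (rows and columns together); a stays fixed.
        if pearson(x, off_diagonal(b, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Toy adjacency matrices for two four-node word networks (made-up data).
system_1 = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
system_2 = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
r, p = qap_correlation(system_1, system_2)
print(f"QAP correlation = {r:.3f}, p = {p:.3f}")
```

Permuting rows and columns together preserves each network's structure while breaking any alignment between the two, which is why the permutation distribution serves as the reference for the observed correlation.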

Figure 2. Comparing Two Systems Using LLA


Figure  3.  Comparing  Three  Categories  Using  LLA 

Figure 4. Comparing Two Time Periods

Figure 5. QAP Correlation via UCINET


Figure  6  shows  a  visualization  of  LLA  with  connected  keywords  or  concepts  as 
groups  or  themes.  Words  are  linked  as  word  pairs  that  appear  next  to  each  other  in  the 
original  documents.  Different  colors  indicate  different  clusters  of  word  groups.  They  were 
produced using a link analysis method, a social network grouping method (Girvan et al.,
2001), in which words are connected, as shown in a single color, as if they were in a social
community. A “hub” is formed around a word that is connected to a list of other
words (“fan-out” words), which may in turn center on other hub words. For instance, Figure 7
shows a detailed view of a theme or word group in Figure 6: the words “analysis, research,
approach” are connected and centered around other related words. We use three words, such as
“analysis, research, approach,” to label a group.


ACQUISITION  RESEARCH  PROGRAM: 
CREATING  SYNERGY  FOR  INFORMED  CHANGE 




Figure  6.  Word  and  Term  of  Themes  Discovered  and  Shown  in  Colored  Groups 


Figure  7.  A  Detailed  View  of  a  Theme  or  Word  Group  From  Figure  6 


The  detailed  steps  of  LLA  processing  include  applying  collaborative  learning  agents 
(CLAs)  and  generating  visualizations,  including  a  lexical  network  visualization  via  AutoMap 
(2009),  radar  visualization,  and  matrix  visualization  (Zhao,  Gallup,  &  MacKinnon,  2010).  The 
following  are  the  steps  for  performing  an  LLA: 

•  Read  each  set  of  documents. 

•  Select feature-like word pairs.

•  Apply  a  social  network  community  finding  algorithm  (e.g.,  Newman  grouping 
method; Girvan et al., 2001) to group the word pairs into themes. A theme
includes  a  collection  of  lexical  word  pairs  connected  to  each  other. 

•  Compute a “weight” for each theme in each time period, that is, the number
of word pairs that belong to the theme in that time period, as well as across
all time periods.

•  Sort  theme  weights  by  time,  and  study  the  distributions  of  these  themes  by 
time. 
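
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the CLA-based implementation: the stopword list, the function names, and the toy documents are made up, and plain connected components stand in for the Newman grouping method, which would split a large component into finer communities.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real run would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are"}

def word_pairs(text):
    """Return adjacent word pairs, dropping any pair that contains a stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    return {(a, b) for a, b in zip(words, words[1:])
            if a not in STOPWORDS and b not in STOPWORDS}

def group_into_themes(pairs):
    """Assign each word a theme id: one id per connected component.

    Connected components are a stand-in for a community-finding algorithm.
    """
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    theme_of = {}
    next_id = 0
    for start in sorted(neighbors):
        if start in theme_of:
            continue
        stack = [start]
        while stack:
            word = stack.pop()
            if word in theme_of:
                continue
            theme_of[word] = next_id
            stack.extend(n for n in neighbors[word] if n not in theme_of)
        next_id += 1
    return theme_of

def theme_weights(documents_by_period):
    """Map each period to {theme id: number of word pairs in that theme}."""
    pairs_by_period = {
        period: set().union(*(word_pairs(doc) for doc in docs))
        for period, docs in documents_by_period.items()
    }
    theme_of = group_into_themes(set().union(*pairs_by_period.values()))
    weights = {}
    for period, pairs in pairs_by_period.items():
        counts = defaultdict(int)
        for first, _ in pairs:
            counts[theme_of[first]] += 1  # both words share one theme
        weights[period] = dict(counts)
    return weights

# Toy input: one short document per time period (made-up text).
documents = {
    "2011": ["Radar sensor upgrade for the radar system"],
    "2012": ["Radar sensor upgrade and budget review"],
}
weights = theme_weights(documents)
for period in sorted(weights):
    print(period, weights[period])
```

Sorting the per-period weights of a theme then gives its distribution over time, which is the quantity the last step studies.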

Business  Problems  That  LLA  Addresses 

General  areas  that  LLA  usually  informs  are  the  following: 

•  Discovering  themes  and  topics  in  the  unstructured  documents  and  sorting  the 
importance  of  the  themes 

•  Discovering social and semantic networks of the organizations involved, and
comparing the two networks to gain insight into the following:

o  Which organizations were involved in the important
themes

o  How semantic networks might suggest potential for improved
collaboration when compared to social networks

Social  and  Semantic  Networks  Analysis 

Current research in social network analysis focuses mostly on direct associations among
people or organizations, regardless of the content that links them. The so-called study of
centrality  (Girvan,  2002;  Feldman,  2007)  has  been  a  focal  point  for  the  social  network 
structure  study.  Finding  the  centrality  of  a  network  lends  insight  into  the  various  roles  and 
groupings  such  as  the  connectors  (e.g.,  mavens,  leaders,  bridges,  isolated  nodes),  the 
clusters (and who is in them), the network core, and its periphery. We have been working
toward the following areas of innovation in network analysis:

•  Extracting  social  networks  based  on  the  entity  extraction 

•  Extracting  semantic  networks  based  on  the  contents  and  word  pairs  using 
LLA 

•  Applying characteristics and centrality measures from the semantic networks
and social networks to predict latent properties of the social networks, such as
emerging leadership (for example, emerging techniques that might come to
dominate). These characteristics are further categorized into themes and
time-lined trends for informed prediction of future events.
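
As a small illustration of a centrality measure on a word-pair network, the sketch below computes normalized degree centrality in Python. The function name and the toy edge list are assumptions for the example; degree centrality is only one of several measures that could be used, and the sketch does not attempt the prediction step described above.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1)
            for node, adjacent in neighbors.items()}

# Toy word-pair network (made-up links around one hub word).
edges = [("analysis", "research"), ("analysis", "approach"),
         ("analysis", "method"), ("research", "approach")]
centrality = degree_centrality(edges)
hub = max(centrality, key=centrality.get)
print(hub, centrality[hub])
```

A word with high degree centrality corresponds to a hub in the visualizations, while a node with centrality near zero sits on the periphery.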

Implementation  Details 

In  the  past  year,  we  continued  our  efforts  at  the  Naval  Postgraduate  School  (NPS)  by 
using CLAs (Quantum Intelligence [QI], 2009) and expanded to other tools, including
AutoMap  (Center  for  Computational  Analysis  of  Social  and  Organizational  Systems 
[CASOS],  2009)  for  improved  visualizations.  Results  from  these  efforts  arose  from 
leveraging  intelligent  agent  technology  via  an  educational  license  with  Quantum  Intelligence, 
Inc. CLA is a computer-based learning agent, or agent collaboration, capable of ingesting
and  processing  data  sources. 

We  have  been  generating  visualizations  including  a  lexical  network  visualization 
using  various  open  source  tools.  We  began  by  using  the  Organizational  Risk  Assessment 
(ORA;  CASOS,  2009)  tool  and  expanded  to  other  tools.  For  example,  in  the  past  year,  we 
developed 3-D network views using Pajek (2011) and X3D (2011). We also developed our
own visualizations, Radar view and Match view (Zhao et al., 2010).

LLA uses Collaborative Learning Agents (CLAs; QI, 2009) to carry out an unsupervised
learning process that separates patterns from anomalies. The unsupervised agent learning
is implemented by indexing each set of
documents  separately  and  in  parallel  using  multiple  learning  agents.  Multiple  agents  can 
work collaboratively and in parallel. We set up a cluster of Linux servers in the NPS
High Performance Computing Center (HPC) to handle the large-scale data, and a secure
environment in the NPS Secure Technology Battle Laboratory (STBL).

Relations  to  Other  Methods 

The LLA approach is most closely related to Latent Semantic Analysis (LSA;
Dumais,  Furnas,  Landauer,  Deerwester,  &  Harshman,  1988)  and  Probabilistic  Latent 
Semantic  Analysis  (PLSA).  In  the  LSA  approach,  a  term-document  matrix  is  the  starting 
point  for  analysis.  The  elements  of  the  term-document  or  feature-object  (term  as  feature  and 
document as object) matrix are the occurrences of each word in a particular document, that
is, $A = [a_{ij}]$, where $a_{ij}$ denotes the frequency with which term $j$ occurs in
document $i$. The term-document matrix is usually sparse. LSA uses singular value
decomposition (SVD) to reduce
the dimensionality of the term-document matrix. SVD becomes difficult to apply when
the vocabulary (the number of unique terms) in the document collection is large. LSA has
been  widely  used  to  improve  information  indexing,  search/retrieval,  and  text  categorization. 
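
To make the matrix concrete, the following Python sketch builds a small term-document count matrix using the indexing described above (rows are documents, columns are terms). The function names and the three toy documents are assumptions for illustration, and the SVD step is deliberately left out; the sketch only shows how quickly zero entries come to dominate.

```python
from collections import Counter

def term_document_matrix(documents):
    """Return (vocabulary, matrix) with matrix[i][j] = count of term j in doc i."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

def sparsity(matrix):
    """Fraction of cells in the matrix that are zero."""
    cells = [value for row in matrix for value in row]
    return cells.count(0) / len(cells)

# Toy corpus (made-up text).
documents = ["radar sensor upgrade", "budget review", "radar budget"]
vocabulary, matrix = term_document_matrix(documents)
print(vocabulary)
for row in matrix:
    print(row)
print(f"sparsity = {sparsity(matrix):.2f}")
```

Even with three short documents, more than half of the cells are zero, and the share grows as the vocabulary expands.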

A  recent  development  related  to  this  method  is  called  latent  Dirichlet  allocation  (LDA; 
Blei,  Ng,  &  Jordan,  2003),  which  is  a  generative  probabilistic  model  of  a  corpus.  In  LDA,  a 
document  is  considered  to  be  composed  of  a  collection  of  words — a  “bag  of  words,”  where 
word  order  and  grammar  are  not  considered  important.  The  basic  idea  is  that  documents 
are represented as random mixtures over latent topics, where each topic is characterized by
a probability distribution over the words in the corpus (with Dirichlet priors on the
mixtures). Our theme generation in LLA differs from LDA: in LLA, a collection of lexical
terms is connected semantically, as if the terms were in a social community, and social
network grouping methods are used to group the words. Unlike LSA, our method scales
easily to a large vocabulary and is generalizable to any sequential data.

Anticipated  Benefits 

Our LLA method addresses critical needs of acquisition research. Its key advantage
is an innovative, near real-time self-awareness system that transforms diversified data
services into strategic decision-making knowledge, specifically through the following:

•  Automation: The high correlation between LLA results and the link analysis done
by human analysts makes it possible to save manpower and improve
responsiveness. Automation is achieved via computer programs or software
agents that perform LLA frequently and in near real-time.

•  Discovery: LLA “discovers” and displays a network of word pairs. These word
pair networks are characterized by one-, two-, or three-word themes. The
weight of each theme is determined by its frequency of occurrence. LLA
may also reveal blind spots in human analysis that are caused by the
overwhelming volume of data that human analysts must consider.

•  Validation:  LLA  may  provide  different  perspectives  of  links.  In  the  acquisition 
context,  links  discovered  by  human  analysts  may  emphasize  component  and 
part  connections  that  do  not  necessarily  reflect  content  overlaps.  LLA  looks 
for  the  overlapping  of  contents  to  help  identify  improved  affordability  and 
improved  response  to  meeting  warfighter  requirements,  and  achieve  better 
acquisition  decisions.  Consequently,  it  can  provide  improved  results  in  terms 
of  trust  and  quality  of  association  discovery  and  can  help  break  through  the 
taxonomy  of  ignorance  (Denby  &  Gammack,  1999)  and  organizational 
boundaries,  and  help  improve  organizational  reach. 

Research  Status 

Acquisition  Visibility  Portal  Background 

Our  goal  is  to  demonstrate  the  LLA  web  service  for  assisting  the  DoD-wide  effort  of 
integrating  and  maintaining  authoritative  and  accurate  acquisition  data  services  in  both 
legacy  and  new  platforms.  Specifically,  we  wanted  to  analyze  the  data  sources  from  the 
Acquisition  Visibility  Portal  (https://portal.acq.osd.mi )  by  examining  consistency,  correlation, 
and  gaps  among  categories  of  information  for  each  individual  program  listed  in  the  portal. 

One  of  the  biggest  risk  factors  in  defense  acquisition  is  the  unanticipated  effects  of 
program  interactions.  For  example,  ASD(SE)  and  Dahmann  worked  toward  identifying 
interdependence  among  programs  within  a  system  of  systems  (SoS).  Yet,  more  broadly, 
and  as  a  result  of  required  joint  capabilities,  portfolios  often  include  program 
interdependencies  and  system-of-systems  effects.  Ultimately,  the  current  “program-centric” 
acquisition  paradigm  is  increasingly  ill-suited  to  identify  and  address  program  risks  that  arise 
outside of program boundaries. LLA can help isolate these issues, which have yet to be
effectively identified, from the body of information collected.

To  begin  to  address  this  risk,  we  observed  that  very  little  of  the  information 
generated  for  program  oversight  is  amenable  to  effective  analysis.  Every  major  acquisition 
program’s  milestone  review  generates  volumes  of  information,  which  the  OSD  staff  is 
supposed  to  review  to  determine  if  the  program  is  properly  prepared  for  the  next  milestone. 
Although  they  are  beginning  to  compile  these  artifacts  centrally  to  facilitate  review  and 
analysis,  at  present,  the  only  way  to  analyze  the  information  in  these  artifacts  is  to  read 
them.  With  limitations  on  staffing,  little  time  is  available  to  thoroughly  review  these  artifacts. 
Moreover,  each  functional  community  is  required  to  review  only  the  particular  document  for 
which  it  is  responsible.  For  example,  the  systems  engineering  community  typically  only 
examines  the  Systems  Engineering  Plan  (SEP),  the  test  and  evaluation  community  looks 
only  at  the  Test  &  Evaluation  Master  Plan  (TEMP),  and  the  acquisition  community  looks  at 
the  Acquisition  Strategy  Report  (ASR).  Rarely  do  any  of  these  stakeholders  review  multiple 
reports  or  jointly  discuss  them  to  determine  if  they  are  mutually  consistent  and  to  consider 
inconsistencies  that  might  indicate  programmatic  risk.  There  is  even  less  incentive  and 
opportunity  to  look  for  external  factors  that  would  potentially  invalidate  the  assumptions  that 
underpin  the  basic  cost,  schedule,  and  performance  targets  of  each  program  execution. 

Overlaying  the  concept  maps  for  each  of  the  major  categories  of  artifacts  to  conduct 
a  pair-wise  comparison  might  expose  significant  disconnects  between  them.  We  are 
motivated  by  a  situation  in  which  the  SEP  identifies  a  critical  dependency  between  the 
program and an external system, but the TEMP doesn’t have a corresponding reference to
testing  that  interdependency.  Therefore,  it  may  be  productive  to  compare  the  acquisition 
strategy  to  the  SEP  or  TEMP. 

Results 

LLA maps of these artifacts can be compared from one category to another and from one
milestone to the next. For example, if the SEP at Milestone B is significantly different
from the SEP at Milestone C, that difference might indicate a reduction in system
functionality resulting from cost increases elsewhere. These maps,
reported  as  themes,  concepts,  and  word  pairs,  may  help  cue  a  decision-maker’s  attention  to 
the  potential  issues  and  help  the  decision-maker  consider  specific  and  productive  directions 
for  further  scrutiny. 

To  develop  comprehensive  LLA  maps,  we  first  extracted  a  sample  from  a 
representative  MDAP  from  the  Acquisition  Visibility  Portal  (AVP)  with  categories  of 
information  to  demonstrate  the  method,  as  follows: 

•  SEP:  2  documents,  222  pages 

•  TEMP:  5  documents,  62  pages 

•  ASR: 11 documents including metrics, 634 pages

•  SARs:  9  documents,  313  pages 

•  DAES:  19  documents,  447  pages 

•  Milestone B 2366b Certification Acquisition Decision Memorandum (ADM): 12
documents,  105  pages 

•  APB:  3  documents,  39  pages 

•  TRA:  1  document,  1  page 

Figure 8 lists the top 20 themes with the highest correlations discovered when comparing
the data for the ASR and SEP. In Row 2, there are 299 word pairs for the two sources
together, classified in Theme 117(E); 47 of them appear in both sources, indicating potential
feature overlaps. The correlation is the ratio 47/299 = 0.157, which indicates that 15.7% of
the features, represented as word pairs, are shared by both artifacts. As detailed in
Figure 9, some of the 299 word pairs in Theme 117(E) are visualized with red, yellow, and
green links, representing the shared word pairs, those unique to the ASR, and those unique
to the SEP, respectively.
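
The Row 2 arithmetic can be written out as a short worked equation. Reading the ASR and SEP columns of Figure 8 as counts of word pairs unique to each source is an inference from the fact that the row's counts sum to the All Sources total; the figure does not state this outright.

```latex
\[
\text{correlation}
  = \frac{\text{Overlap}}{\text{All Sources}}
  = \frac{47}{299} \approx 0.157,
\qquad
\underbrace{201}_{\text{ASR only}}
  + \underbrace{51}_{\text{SEP only}}
  + \underbrace{47}_{\text{shared}}
  = 299 .
\]
```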

1   Theme Id  All Sources  ASR  SEP  Overlap  Correlation
2   117(E)    299          201  51   47       0.157
3   347(P)    481          346  67   68       0.141
4   395(P)    500          330  102  68       0.136
5   130(P)    590          428  89   73       0.124
6   281(P)    469          372  42   --       0.117
7   210(P)    570          400  105  --       0.114
8   298(P)    599          348  184  67       0.112
9   388(P)    508          381  73   --       0.106
10  263(P)    666          517  79   --       0.105
11  368(P)    669          472  127  --       0.105
12  330(P)    546          391  99   56       0.103
13  147(E)    234          181  29   24       0.103
14  224(E)    331          236  62   33       0.100
15  144(P)    490          350  92   48       0.098
16  270(P)    502          371  82   49       0.098
17  235(E)    431          329  60   42       0.097
18  245(E)    281          215  39   27       0.096
19  113(E)    334          245  57   32       0.096
20  SlO(P)    586          441  90   55       0.094
21  182(A)    197          157  22   18       0.091

Figure 8. Themes for Comparing SEP and ASR, Sorted by Descending Correlation

Figure 9 shows that there are concepts related to these word nodes that are
unique to the ASR or to the SEP.

Since  the  SEP  document  is  supposed  to  support  the  ASR,  the  illustrations  and 
visualizations  of  it  might  inform  acquisition  professionals  about  why  concepts  in  the  SEP 
were  missing  from  the  ASR,  and  vice  versa. 

Figure 9. Detail of Word Pairs for Theme 117(E): Red Links for Word Pairs Shared by the
SEP and ASR, Yellow Links for Word Pairs Unique to the ASR, and Green
Links for Word Pairs Unique to the SEP

Figure 10 lists the least correlated themes discovered when comparing the data for the
ASR and SEP. In Row 2, there are 149 word pairs for the two sources together, classified in
Theme 359(A); four of them appear in both sources (overlap). The correlation is the
ratio 4/149 = 0.027. As detailed in Figure 11, some of the 149 word pairs in Theme 359(A)
are visualized with red, yellow, and green links, representing the shared word pairs, those
unique to the ASR, and those unique to the SEP, respectively.
 1  Theme Id  All Sources  ASR  SEP  Overlap  Correlation
 2  359(A)    149          127  18   4        0.027
 3  390(A)    173          150  18   5        0.029
 4  419(A)    95           73   18   4        0.042
 5  267(A)    149          123  19   7        0.047
 6  238(A)    170          121  41   8        0.047
 7  293(A)    231          184  36   11       0.048
 8  76(E)     249          208  28   13       0.052
 9  408(E)    419          376  21   22       0.053
10  287(A)    223          187  24   12       0.054
11  203(E)    259          170  75   14       0.054
12  334(E)    276          218  45   15       0.054
13  135(E)    271          218  38   15       0.055
14  104(A)    196          163  22   11       0.056
15  63(E)     314          253  43   18       0.057
16  373(P)    480          403  49   28       0.058
17  372(P)    608          509  62   37       0.061
18  389(A)    155          137  8    10       0.065
19  331(E)    383          246  112  25       0.065
20  205(P)    561          420  104  37       0.066
21            490          414  43   33       0.067

Figure 10. Themes for Comparing SEP and ASR, Sorted According to Descending Correlation


Figure 11. Detail of Word Pairs for Theme 359(A): Red Links for Word Pairs Shared by SEP and ASR (Yellow Links for Word Pairs Unique to ASR, and Green Links for Word Pairs Unique to SEP)
In Figure 11, there are also concepts that are more prevalent in the ASR than in the SEP. The ASR includes other concepts that are not in the SEP and that might be important.

LLA  also  categorizes  themes  into  popular  (P),  emerging  (E),  and  anomalous  (A). 
Comparing  Figure  8  and  Figure  10,  one  can  see  that  popular  themes  tend  to  have  higher 
correlations  among  data  sources  (ASR  and  SEP),  while  anomalous  themes  tend  to  have 
lower  correlations  among  data  sources. 

For each pair of information categories, we use the ratio of the number of word pairs that appear in both categories to the total number of word pairs as the overall correlation for that pair.
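
As a rough sketch of how such a matrix of overall correlations could be assembled, the Python below computes the ratio for every pair of categories. It assumes that each category is represented as a set of word pairs and that "the total number of word pairs" means the union of the two sets; both are assumptions made for illustration, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def overall_correlation(pairs_a, pairs_b):
    """Shared word pairs divided by all distinct word pairs in the two categories."""
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def correlation_matrix(word_pairs_by_category):
    """Build a symmetric matrix (dict of dicts) with 1.0 on the diagonal."""
    names = sorted(word_pairs_by_category)
    matrix = {a: {b: 1.0 if a == b else 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        value = overall_correlation(word_pairs_by_category[a],
                                    word_pairs_by_category[b])
        matrix[a][b] = matrix[b][a] = value
    return matrix


# Toy input (hypothetical word pairs, not taken from the program documents).
toy = {
    "DAES": {("cost", "growth"), ("unit", "cost"), ("schedule", "slip")},
    "SARs": {("cost", "growth"), ("unit", "cost"), ("funding", "profile")},
    "TEMP": {("test", "event"), ("cost", "growth")},
}
toy_matrix = correlation_matrix(toy)
for name, row in toy_matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

In this toy input, DAES and SARs share two of four distinct word pairs, so their overall correlation is 0.5, while each shares one of four with TEMP.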

In Table 1, the highlighted cells are the ones with correlation > 0.06. The categories DAES, SARs, and SEP have higher overall correlations with the other categories. The two most correlated categories are SARs and DAES (correlation = 0.117). The category TEMP has the lowest overall correlations with other categories. Although TEMP and SEP were both produced in the test and evaluation community, the correlation between the two is low (0.027).
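
Because cell highlighting does not survive in plain text, the short Python sketch below lists the category pairs whose correlation exceeds the 0.06 threshold. The values are the upper triangle of Table 1; the variable and function names are illustrative only.

```python
CATEGORIES = ["APB", "ASR", "2366B Cert", "DAES", "SARs", "SEP", "TEMP", "TRA"]

# Upper triangle of Table 1: row i holds the correlations of CATEGORIES[i]
# with each category listed after it.
UPPER = [
    [0.007, 0.027, 0.022, 0.080, 0.014, 0.010, 0.005],  # APB
    [0.015, 0.048, 0.025, 0.075, 0.028, 0.001],         # ASR
    [0.026, 0.038, 0.026, 0.018, 0.068],                # 2366B Cert
    [0.117, 0.073, 0.023, 0.003],                       # DAES
    [0.044, 0.020, 0.004],                              # SARs
    [0.027, 0.003],                                     # SEP
    [0.002],                                            # TEMP
]


def pairs_above(threshold):
    """Return (category, category, correlation) above the threshold, highest first."""
    hits = []
    for i, row in enumerate(UPPER):
        for offset, value in enumerate(row, start=1):
            if value > threshold:
                hits.append((CATEGORIES[i], CATEGORIES[i + offset], value))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


highlighted = pairs_above(0.06)
for first, second, value in highlighted:
    print(f"{first} / {second}: {value:.3f}")
```

Five pairs exceed the threshold: DAES/SARs (0.117), APB/SARs (0.080), ASR/SEP (0.075), DAES/SEP (0.073), and 2366B Cert/TRA (0.068).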


Table 1. LLA Correlations Between Categories of Information

            APB    ASR    2366B Cert  DAES   SARs   SEP    TEMP   TRA
APB         1.000  0.007  0.027       0.022  0.080  0.014  0.010  0.005
ASR         0.007  1.000  0.015       0.048  0.025  0.075  0.028  0.001
2366B Cert  0.027  0.015  1.000       0.026  0.038  0.026  0.018  0.068
DAES        0.022  0.048  0.026       1.000  0.117  0.073  0.023  0.003
SARs        0.080  0.025  0.038       0.117  1.000  0.044  0.020  0.004
SEP         0.014  0.075  0.026       0.073  0.044  1.000  0.027  0.003
TEMP        0.010  0.028  0.018       0.023  0.020  0.027  1.000  0.002
TRA         0.005  0.001  0.068       0.003  0.004  0.003  0.002  1.000


When we discussed the findings with the domain expert, the correlation between DAES and SARs seemed surprisingly low, even though it is the highest in Table 1. DAES and SARs reports are similar in context and content (both relate to acquisition performance), so they would be expected to have a higher correlation. Further investigations, such as the following, are needed to see what might cause the low correlation:

•  To investigate whether significantly different content appears in the two types of reports; for example, DAES reports may include more details than SARs reports.

•  To separate the SAR and DAES reports by year and compute the correlations over time, to see when the significant discrepancies (that is, the drop in the correlation) first appeared.

•  To correlate the DAES or SAR reports over time separately, to see whether increases and decreases in the correlation might be related to new features being introduced into the program, and thus to relate the size of the changes found by LLA to numeric metrics such as cost, schedule, funding, and performance.
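
The second investigation above (correlations by year) could be prototyped along the following lines. This is a minimal sketch under stated assumptions: word pairs have already been extracted per report year, the correlation is the shared-to-total ratio used elsewhere in this paper (with the total taken as the union of the two sets), and all years, word pairs, and function names are hypothetical.

```python
def ratio(pairs_a, pairs_b):
    """Shared word pairs divided by all distinct word pairs in the two sets."""
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def correlation_by_year(sar_by_year, daes_by_year):
    """Correlation between SAR and DAES word pairs for each year present in both."""
    shared_years = sorted(set(sar_by_year) & set(daes_by_year))
    return {year: ratio(sar_by_year[year], daes_by_year[year])
            for year in shared_years}


def largest_drop(series):
    """Return (year, change) for the biggest year-over-year decrease, or None."""
    years = sorted(series)
    changes = [(series[cur] - series[prev], cur)
               for prev, cur in zip(years, years[1:])]
    if not changes:
        return None
    change, year = min(changes)
    return (year, change) if change < 0 else None


# Hypothetical word pairs per report year, for illustration only.
sar = {
    2010: {("cost", "growth"), ("unit", "cost"), ("schedule", "slip")},
    2011: {("cost", "growth"), ("unit", "cost")},
    2012: {("cost", "growth"), ("radar", "upgrade")},
}
daes = {
    2010: {("cost", "growth"), ("unit", "cost"), ("schedule", "slip")},
    2011: {("cost", "growth"), ("unit", "cost"), ("funding", "gap")},
    2012: {("software", "delay"), ("funding", "gap")},
}

series = correlation_by_year(sar, daes)
drop = largest_drop(series)
print(series, drop)
```

In this toy series the correlation falls from 1.0 to about 0.67 and then to 0.0, so the largest drop is flagged in the last year. With real data, the flagged year would be the starting point for comparing against cost, schedule, funding, and performance metrics.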

Future  Work 

Since this is the first program to have undergone a relatively comprehensive LLA analysis using multiple types of acquisition documents, the findings cannot be evaluated in terms of being “good” or “bad,” “normal” or “unusual,” and so forth. Therefore, future investigation should consider the following additional studies:

•  Analyze additional programs in the AVP, compute correlation matrices like Table 1, and compare the results to determine whether the correlation patterns are similar or different.

•  Discuss the findings in detail with the domain experts and personnel associated with the programs to see whether the correlation patterns are significant, as follows:

o  whether the correlations indicate data quality issues, and

o  whether the correlation patterns affect the costs, schedules, funding, and performance of the programs.

Conclusion 

In  this  paper,  we  demonstrated  how  to  apply  LLA  to  generate  maps  of  the  acquisition 
artifacts  among  multiple  categories  of  data.  These  maps,  reported  as  themes,  concepts,  and 
word  pairs,  may  help  identify  the  issues  and  offer  specific  and  productive  directions  for 
further  examination  as  to  why  there  are  gaps  among  the  categories  of  information. 